The Minimum Description Length Principle in Coding and Modeling
Abstract
We review the principles of Minimum Description Length and Stochastic Complexity as used in data compression and statistical modeling. Stochastic complexity is formulated as the solution to optimum universal coding problems extending Shannon’s basic source coding theorem. The normalized maximized likelihood, mixture, and predictive codings are each shown to achieve the stochastic complexity to within asymptotically vanishing terms. We assess the performance of the minimum description length criterion both from the vantage point of quality of data compression and accuracy of statistical inference. Context tree modeling, density estimation, and model selection in Gaussian linear regression serve as examples.
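As a rough illustration of the three codings named in the abstract, the sketch below computes code lengths in bits for a binary sequence under the Bernoulli model class. The model class, the Jeffreys Beta(1/2, 1/2) prior for the mixture code, the add-1/2 (Krichevsky-Trofimov) predictor for the predictive code, and all function names are assumptions chosen for this example; they are not taken from the paper.

```python
import math


def ml_log2(k, n):
    """log2 of the maximized Bernoulli likelihood for k ones among n symbols."""
    total = 0.0
    for count in (k, n - k):
        if count > 0:  # convention: 0 * log(0) = 0
            total += count * math.log2(count / n)
    return total


def nml_code_length(k, n):
    """Normalized maximum likelihood code length for a sequence with k ones.

    The normalizer sums the maximized likelihood over all 2^n sequences,
    grouped by their number of ones. Suitable for moderate n only, since
    the binomial coefficients are converted to floats.
    """
    normalizer = sum(math.comb(n, j) * 2.0 ** ml_log2(j, n) for j in range(n + 1))
    return -ml_log2(k, n) + math.log2(normalizer)


def mixture_code_length(k, n):
    """Mixture code length under a Jeffreys Beta(1/2, 1/2) prior (assumed)."""
    log_prob = (math.lgamma(k + 0.5) + math.lgamma(n - k + 0.5)
                - math.lgamma(n + 1) - 2.0 * math.lgamma(0.5))
    return -log_prob / math.log(2.0)


def predictive_code_length(bits):
    """Predictive code length: each symbol is coded with the add-1/2 estimate
    of the probability of a one, computed from the symbols seen so far."""
    ones = 0
    length = 0.0
    for t, bit in enumerate(bits):
        p_one = (ones + 0.5) / (t + 1.0)
        length -= math.log2(p_one if bit == 1 else 1.0 - p_one)
        ones += bit
    return length


if __name__ == "__main__":
    bits = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0] * 10  # 100 symbols, 30 ones
    k, n = sum(bits), len(bits)
    print("NML       :", nml_code_length(k, n))
    print("mixture   :", mixture_code_length(k, n))
    print("predictive:", predictive_code_length(bits))
```

With these particular choices the predictive and mixture code lengths coincide, because the add-1/2 predictor is the sequential form of the Jeffreys mixture. For other model classes or priors the three lengths generally differ by small terms.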
Related resources
Iterated logarithmic expansions of the pathwise code lengths for exponential families
Rissanen's Minimum Description Length (MDL) principle is a statistical modeling principle motivated by coding theory. For exponential families we obtain pathwise expansions, to the constant order, of the predictive and mixture code lengths used in MDL. The results are useful for understanding different MDL forms.
On the minimum description length principle for sources with piecewise constant parameters
Universal lossless coding in the presence of finitely many abrupt changes in the statistics of the source, at unknown points, is investigated. The minimum description length (MDL) principle is derived for this setting. In particular, it is shown that for any uniquely decipherable code, for almost every combination of statistical parameter vectors governing each segment, and for almost every vec...
Adaptive partially hidden Markov models with application to bilevel image coding
Partially hidden Markov models (PHMMs) have previously been introduced. The transition and emission/output probabilities from hidden states, as known from the HMMs, are conditioned on the past. This way, the HMM may be applied to images introducing the dependencies of the second dimension by conditioning. In this paper, the PHMM is extended to multiple sequences with a multiple token version an...
Minimum Description Length Induction, Bayesianism, and Kolmogorov Complexity
The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles minimum description length (MDL) and minimum message length (MML), abstracted as the ideal MDL principle and defined from Bayes’s rule by means of Kolmogorov complexity. The basic condition under which the ideal principle should be app...
Chapter 7 Asymptotics and Coding Theory: One of the n → ∞ Dimensions of Terry
Terry joined the Berkeley Statistics faculty in the summer of 1987 after being the statistics head of CSIRO in Australia. His office was just down the hallway from mine on the third floor of Evans. I was beginning my third year at Berkeley then and I remember talking to him in the hallway after a talk that he gave on information theory and the Minimum Description Length (MDL) Principle of Rissa...
Journal:
- IEEE Trans. Information Theory
Volume 44
Publication date: 1998